{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "c014a89a-05f5-446a-bc53-dd048c6c4997",
   "metadata": {},
   "source": [
     "## 1. Document Querying in LangChain\n",
     "\n",
     "After the documents have been loaded and split, we need a way to retrieve the chunks relevant to a question. To do this, each chunk is first converted into a feature vector with an embedding model, and the vectors are stored in a vector database.\n",
     "\n",
     "LangChain implements document querying through its embedding classes (embeddings) and vector store classes (vectorstores). This notebook walks through using embeddings and vectorstores to query txt and markdown documents.\n",
     "\n"
   ]
  },
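  {
   "cell_type": "markdown",
   "id": "b7c1f0aa-1111-4abc-9def-000000000001",
   "metadata": {},
   "source": [
    "The retrieval idea above can be sketched in plain Python. This is a minimal illustration of nearest-vector lookup, not LangChain's actual API; the three-dimensional vectors and chunk names are made up for the example (a real embedding model produces high-dimensional vectors):\n",
    "\n",
    "```python\n",
    "import math\n",
    "\n",
    "def cosine(a, b):\n",
    "    # Cosine similarity between two embedding vectors\n",
    "    dot = sum(x * y for x, y in zip(a, b))\n",
    "    na = math.sqrt(sum(x * x for x in a))\n",
    "    nb = math.sqrt(sum(y * y for y in b))\n",
    "    return dot / (na * nb)\n",
    "\n",
    "# Toy vector store: chunk name -> embedding vector\n",
    "store = {\n",
    "    'chunk-about-cats': [0.9, 0.1, 0.0],\n",
    "    'chunk-about-dogs': [0.8, 0.3, 0.1],\n",
    "    'chunk-about-tax':  [0.0, 0.1, 0.9],\n",
    "}\n",
    "\n",
    "def query(q_vec, k=1):\n",
    "    # Rank chunks by similarity to the query vector and return the top k\n",
    "    ranked = sorted(store, key=lambda name: cosine(q_vec, store[name]), reverse=True)\n",
    "    return ranked[:k]\n",
    "\n",
    "print(query([0.9, 0.1, 0.0]))   # ['chunk-about-cats']\n",
    "print(query([0.0, 0.0, 1.0]))   # ['chunk-about-tax']\n",
    "```"
   ]
  },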
  {
   "cell_type": "markdown",
   "id": "eb420814-0aaf-4244-9c4b-70e087c509d7",
   "metadata": {},
   "source": [
     "## 2. Hands-on Walkthrough\n",
     "🔹 This case requires a P100 or higher flavor. Make sure you are running on a matching flavor; you can switch flavors as shown below.\n",
     "\n",
     "![](https://modelarts-labs-bj4-v2.obs.cn-north-4.myhuaweicloud.com/case_zoo/chatglm3/image/1.png)\n",
     "\n",
     "🔹 Clicking Run in ModelArts opens ModelArts CodeLab. You will need to sign in with a Huawei Cloud account; if you do not have one yet, register and complete real-name verification by following [ModelArts Setup (Quick Guide)](https://developer.huaweicloud.com/develop/aigallery/article/detail?id=4ce709d6-eb25-4fa4-b214-e2e5d6b7919c). After signing in, wait a moment and the CodeLab runtime will open.\n",
     "\n",
     "🔹 If you encounter Out Of Memory, check whether your parameter settings are too high; lower them and restart the kernel, or switch to a larger flavor ❗❗❗"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9a579a34-88f2-4fe3-94f1-88a6597a4a13",
   "metadata": {},
   "source": [
     "### 2.1 Download the Model and Data"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "11b1bd3e-e804-49fe-b51f-2b535ddf6eb1",
   "metadata": {},
   "source": [
     "Download the nltk_data dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "9ac410a0-8844-47fc-b837-3a8549bc8ae8",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:root:Using MoXing-v2.1.0.5d9c87c8-5d9c87c8\n",
      "\n",
      "INFO:root:Using OBS-Python-SDK-3.20.9.1\n"
     ]
    }
   ],
   "source": [
     "import os\n",
     "import moxing as mox\n",
     "\n",
     "# Copy the nltk_data archive from OBS to the local working directory\n",
     "work_dir = '/home/ma-user/work'\n",
     "obs_path = 'obs://dtse-models/tar-models/nltk_data.tar'\n",
     "ma_path = os.path.join(work_dir, 'nltk_data.tar')\n",
     "mox.file.copy(obs_path, ma_path)\n",
     "\n",
     "# Copy the sample documents that will be queried later in this notebook\n",
     "mox.file.copy_parallel('obs://modelarts-labs-bj4-v2/case_zoo/langchain-ChatGLM/file/docs', '/home/ma-user/work/docs')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bf52e9db-3d99-452f-807d-38a048ac4db0",
   "metadata": {},
   "source": [
     "Change to the working directory and extract the data archive"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "130fdc75-d9e6-4971-96af-74bc32f5e6f4",
   "metadata": {
    "scrolled": true,
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
       "/home/ma-user/work\n",
       "\n",
       "nltk_data/\n",
       "nltk_data/misc/\n",
       "nltk_data/tokenizers/\n",
       "nltk_data/sentiment/\n",
       "nltk_data/chunkers/\n",
       "nltk_data/help/\n",
       "nltk_data/taggers/\n",
       "nltk_data/corpora/\n",
       "nltk_data/stemmers/\n",
       "nltk_data/grammars/\n",
       "nltk_data/models/\n",
       "... (full file listing truncated)\n"
     ]
    }
   ],
   "source": [
     "os.chdir(work_dir)\n",
     "!pwd\n",
     "# Extract the NLTK data into /home/ma-user/work/nltk_data\n",
     "!tar -xvf nltk_data.tar"
   ]
  },
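  {
   "cell_type": "markdown",
   "id": "c2d3e4f5-2222-4abc-9def-000000000002",
   "metadata": {},
   "source": [
    "The shell extraction above can also be done with Python's standard `tarfile` module. The sketch below is self-contained: it creates its own tiny sample archive rather than assuming `nltk_data.tar` exists locally:\n",
    "\n",
    "```python\n",
    "import os\n",
    "import tarfile\n",
    "import tempfile\n",
    "\n",
    "# Build a tiny sample archive so the sketch is self-contained\n",
    "tmp = tempfile.mkdtemp()\n",
    "src = os.path.join(tmp, 'payload.txt')\n",
    "with open(src, 'w') as f:\n",
    "    f.write('hello')\n",
    "archive = os.path.join(tmp, 'nltk_data.tar')\n",
    "with tarfile.open(archive, 'w') as tar:\n",
    "    tar.add(src, arcname='nltk_data/payload.txt')\n",
    "\n",
    "# Equivalent of `!tar -xvf nltk_data.tar`, but in pure Python\n",
    "with tarfile.open(archive) as tar:\n",
    "    tar.extractall(tmp)\n",
    "\n",
    "print(os.path.exists(os.path.join(tmp, 'nltk_data', 'payload.txt')))  # True\n",
    "```"
   ]
  },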
  {
   "cell_type": "markdown",
   "id": "50ea0726-14f2-440f-a08f-e59451ed373d",
   "metadata": {},
   "source": [
     "Download the text2vec-large-chinese model, used for general-purpose Chinese semantic matching"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "545bced4-7f04-4a86-aaaa-074e4c84b1e7",
   "metadata": {},
   "outputs": [],
   "source": [
     "import os\n",
     "import moxing as mox\n",
     "\n",
     "# Copy the text2vec-large-chinese model archive from OBS to the working directory\n",
     "obs_path = 'obs://dtse-models/tar-models/text2vec-large-chinese.tar'\n",
     "ma_path = os.path.join(work_dir, 'text2vec-large-chinese.tar')\n",
     "mox.file.copy(obs_path, ma_path)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1e7dcb42-884c-44ca-8ece-0d6f3935133e",
   "metadata": {},
   "source": [
     "Change to the working directory and extract the model archive"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "92f659f9-dadc-4bf6-9800-338f4a8ec578",
   "metadata": {
    "scrolled": true,
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "/home/ma-user/work\n",
      "\n",
      "text2vec-large-chinese/\n",
      "\n",
      "text2vec-large-chinese/.gitattributes\n",
      "\n",
      "text2vec-large-chinese/README.md\n",
      "\n",
      "text2vec-large-chinese/config.json\n",
      "\n",
      "text2vec-large-chinese/eval_results.txt\n",
      "\n",
      "text2vec-large-chinese/models--GanymedeNil--text2vec-large-chinese/\n",
      "\n",
      "text2vec-large-chinese/models--GanymedeNil--text2vec-large-chinese/blobs/\n",
      "\n",
      "text2vec-large-chinese/models--GanymedeNil--text2vec-large-chinese/blobs/eaf5cb71c0eeab7db3c5171da504e5867b3f67a78e07bdba9b52d334ae35adb3.lock\n",
      "\n",
      "text2vec-large-chinese/models--GanymedeNil--text2vec-large-chinese/refs/\n",
      "\n",
      "text2vec-large-chinese/models--GanymedeNil--text2vec-large-chinese/refs/main\n",
      "\n",
      "text2vec-large-chinese/models--GanymedeNil--text2vec-large-chinese/snapshots/\n",
      "\n",
      "text2vec-large-chinese/models--GanymedeNil--text2vec-large-chinese/snapshots/064717f2acfd7253bea91079d59b82e50b58c886/\n",
      "\n",
      "text2vec-large-chinese/pytorch_model.bin\n",
      "\n",
      "text2vec-large-chinese/special_tokens_map.json\n",
      "\n",
      "text2vec-large-chinese/tmpqlu9nxcm\n",
      "\n",
      "text2vec-large-chinese/tokenizer.json\n",
      "\n",
      "text2vec-large-chinese/tokenizer_config.json\n",
      "\n",
      "text2vec-large-chinese/vocab.txt\n"
     ]
    }
   ],
   "source": [
     "os.chdir(work_dir)\n",
     "!pwd\n",
     "# Extract the embedding model into /home/ma-user/work/text2vec-large-chinese\n",
     "!tar -xvf text2vec-large-chinese.tar"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2a5bc17f-c539-45d9-b80c-d48ebcecea0b",
   "metadata": {},
   "source": [
     "### 2.2 Environment Setup\n",
     "\n",
     "This case requires Python 3.10.10 or later, so we first create a virtual environment:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "86ab2994-2d9c-4404-a017-c5a3b8a8d671",
   "metadata": {
    "scrolled": true,
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
       "Collecting package metadata (repodata.json): done\n",
       "Solving environment: done\n",
       "\n",
       "## Package Plan ##\n",
       "\n",
       "  environment location: /home/ma-user/anaconda3/envs/python-3.10.10\n",
       "\n",
       "  added / updated specs:\n",
       "    - python=3.10.10\n",
       "\n",
       "... (package list truncated)\n",
       "\n",
       "Preparing transaction: done\n",
       "Verifying transaction: done\n",
       "Executing transaction: done\n",
       "\n",
       "# To activate this environment, use\n",
       "#     $ conda activate python-3.10.10\n",
       "\n",
      "Looking in indexes: http://repo.myhuaweicloud.com/repository/pypi/simple\n",
      "\n",
      "Collecting ipykernel\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/5d/4b/ffb537e392e730c9a5b02758f9c87077d9087bcb0d957853e13f121e5ea7/ipykernel-6.23.1-py3-none-any.whl (152 kB)\n",
      "\n",
      "Collecting comm>=0.1.1 (from ipykernel)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/74/f3/b88d7e1dadf741550c56b70d7ce62673354fddb68e143d193ceb80224208/comm-0.1.3-py3-none-any.whl (6.6 kB)\n",
      "\n",
      "Collecting debugpy>=1.6.5 (from ipykernel)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/1f/19/345c21f6b62acf556c39e4358a22b0ad868fecb462c1041c13513d229b33/debugpy-1.6.6-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.0 MB)\n",
      "\n",
      "Collecting ipython>=7.23.1 (from ipykernel)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/09/db/d641ee07f319002393524b6c5a8b47370520dcb2b6166a0972cfe9398c60/ipython-8.10.0-py3-none-any.whl (784 kB)\n",
      "\n",
      "Collecting jupyter-client>=6.1.12 (from ipykernel)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/07/37/4019d2c41ca333c08dfdfeb84c0fc0368c8defbbd3c8f0c9a530851e5813/jupyter_client-8.2.0-py3-none-any.whl (103 kB)\n",
      "\n",
      "Collecting jupyter-core!=5.0.*,>=4.12 (from ipykernel)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/41/1e/92a67f333b9335f04ce409799c030dcfb291712658b9d9d13997f7c91e5a/jupyter_core-5.3.0-py3-none-any.whl (93 kB)\n",
      "\n",
      "Collecting matplotlib-inline>=0.1 (from ipykernel)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/a6/2d/2230afd570c70074e80fd06857ba2bdc5f10c055bd9125665fe276fadb67/matplotlib_inline-0.1.3-py3-none-any.whl (8.2 kB)\n",
      "\n",
      "Collecting nest-asyncio (from ipykernel)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/06/e0/93453ebab12f5ce9a9ceda2ff71648b30e5f2ce5bba19ee3c95cbd0aaa67/nest_asyncio-1.5.4-py3-none-any.whl (5.1 kB)\n",
      "\n",
      "Collecting packaging (from ipykernel)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/ec/1a/610693ac4ee14fcdf2d9bf3c493370e4f2ef7ae2e19217d7a237ff42367d/packaging-23.2-py3-none-any.whl (53 kB)\n",
      "\n",
      "Collecting psutil (from ipykernel)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/6e/c8/784968329c1c67c28cce91991ef9af8a8913aa5a3399a6a8954b1380572f/psutil-5.9.4-cp36-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (280 kB)\n",
      "\n",
      "Collecting pyzmq>=20 (from ipykernel)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/79/f4/c33ff6e3d7bfbceecbb2176f75328c897365a519f507d226e44eea74d6d2/pyzmq-25.1.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1 MB)\n",
      "\n",
      "Collecting tornado>=6.1 (from ipykernel)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/66/a5/e6da56c03ff61200d5a43cfb75ab09316fc0836aa7ee26b4e9dcbfc3ae85/tornado-6.3.3-cp38-abi3-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (427 kB)\n",
      "\n",
      "Collecting traitlets>=5.4.0 (from ipykernel)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/77/75/c28e9ef7abec2b7e9ff35aea3e0be6c1aceaf7873c26c95ae1f0d594de71/traitlets-5.9.0-py3-none-any.whl (117 kB)\n",
      "\n",
      "Collecting backcall (from ipython>=7.23.1->ipykernel)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/4c/1c/ff6546b6c12603d8dd1070aa3c3d273ad4c07f5771689a7b69a550e8c951/backcall-0.2.0-py2.py3-none-any.whl (11 kB)\n",
      "\n",
      "Collecting decorator (from ipython>=7.23.1->ipykernel)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/d5/50/83c593b07763e1161326b3b8c6686f0f4b0f24d5526546bee538c89837d6/decorator-5.1.1-py3-none-any.whl (9.1 kB)\n",
      "\n",
      "Collecting jedi>=0.16 (from ipython>=7.23.1->ipykernel)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/6d/60/4acda63286ef6023515eb914543ba36496b8929cb7af49ecce63afde09c6/jedi-0.18.2-py2.py3-none-any.whl (1.6 MB)\n",
      "\n",
      "Collecting pickleshare (from ipython>=7.23.1->ipykernel)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/9a/41/220f49aaea88bc6fa6cba8d05ecf24676326156c23b991e80b3f2fc24c77/pickleshare-0.7.5-py2.py3-none-any.whl (6.9 kB)\n",
      "\n",
      "Collecting prompt-toolkit<3.1.0,>=3.0.30 (from ipython>=7.23.1->ipykernel)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/87/3f/1f5a0ff475ae6481f4b0d45d4d911824d3218b94ee2a97a8cb84e5569836/prompt_toolkit-3.0.38-py3-none-any.whl (385 kB)\n",
      "\n",
      "Collecting pygments>=2.4.0 (from ipython>=7.23.1->ipykernel)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/0b/42/d9d95cc461f098f204cd20c85642ae40fbff81f74c300341b8d0e0df14e0/Pygments-2.14.0-py3-none-any.whl (1.1 MB)\n",
      "\n",
      "Collecting stack-data (from ipython>=7.23.1->ipykernel)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/6a/81/aa96c25c27f78cdc444fec27d80f4c05194c591465e491a1358d8a035bc1/stack_data-0.6.2-py3-none-any.whl (24 kB)\n",
      "\n",
      "Collecting pexpect>4.3 (from ipython>=7.23.1->ipykernel)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/39/7b/88dbb785881c28a102619d46423cb853b46dbccc70d3ac362d99773a78ce/pexpect-4.8.0-py2.py3-none-any.whl (59 kB)\n",
      "\n",
      "Collecting python-dateutil>=2.8.2 (from jupyter-client>=6.1.12->ipykernel)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/36/7a/87837f39d0296e723bb9b62bbb257d0355c7f6128853c78955f57342a56d/python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)\n",
      "\n",
      "Collecting platformdirs>=2.5 (from jupyter-core!=5.0.*,>=4.12->ipykernel)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/ba/24/a83a900a90105f8ad3f20df5bb5a2cde886df7125c7827e196e4ed4fa8a7/platformdirs-3.0.0-py3-none-any.whl (14 kB)\n",
      "\n",
      "Collecting parso<0.9.0,>=0.8.0 (from jedi>=0.16->ipython>=7.23.1->ipykernel)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/05/63/8011bd08a4111858f79d2b09aad86638490d62fbf881c44e434a6dfca87b/parso-0.8.3-py2.py3-none-any.whl (100 kB)\n",
      "\n",
      "Collecting ptyprocess>=0.5 (from pexpect>4.3->ipython>=7.23.1->ipykernel)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/22/a6/858897256d0deac81a172289110f31629fc4cee19b6f01283303e18c8db3/ptyprocess-0.7.0-py2.py3-none-any.whl (13 kB)\n",
      "\n",
      "Collecting wcwidth (from prompt-toolkit<3.1.0,>=3.0.30->ipython>=7.23.1->ipykernel)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/59/7c/e39aca596badaf1b78e8f547c807b04dae603a433d3e7a7e04d67f2ef3e5/wcwidth-0.2.5-py2.py3-none-any.whl (30 kB)\n",
      "\n",
      "Collecting six>=1.5 (from python-dateutil>=2.8.2->jupyter-client>=6.1.12->ipykernel)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/d9/5a/e7c31adbe875f2abbb91bd84cf2dc52d792b5a01506781dbcf25c91daf11/six-1.16.0-py2.py3-none-any.whl (11 kB)\n",
      "\n",
      "Collecting executing>=1.2.0 (from stack-data->ipython>=7.23.1->ipykernel)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/28/3c/bc3819dd8b1a1588c9215a87271b6178cc5498acaa83885211f5d4d9e693/executing-1.2.0-py2.py3-none-any.whl (24 kB)\n",
      "\n",
      "Collecting asttokens>=2.1.0 (from stack-data->ipython>=7.23.1->ipykernel)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/f3/e1/64679d9d0759db5b182222c81ff322c2fe2c31e156a59afd6e9208c960e5/asttokens-2.2.1-py2.py3-none-any.whl (26 kB)\n",
      "\n",
      "Collecting pure-eval (from stack-data->ipython>=7.23.1->ipykernel)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/2b/27/77f9d5684e6bce929f5cfe18d6cfbe5133013c06cb2fbf5933670e60761d/pure_eval-0.2.2-py3-none-any.whl (11 kB)\n",
      "\n",
      "Installing collected packages: wcwidth, pure-eval, ptyprocess, pickleshare, executing, backcall, traitlets, tornado, six, pyzmq, pygments, psutil, prompt-toolkit, platformdirs, pexpect, parso, packaging, nest-asyncio, decorator, debugpy, python-dateutil, matplotlib-inline, jupyter-core, jedi, comm, asttokens, stack-data, jupyter-client, ipython, ipykernel\n",
      "\n",
      "Successfully installed asttokens-2.2.1 backcall-0.2.0 comm-0.1.3 debugpy-1.6.6 decorator-5.1.1 executing-1.2.0 ipykernel-6.23.1 ipython-8.10.0 jedi-0.18.2 jupyter-client-8.2.0 jupyter-core-5.3.0 matplotlib-inline-0.1.3 nest-asyncio-1.5.4 packaging-23.2 parso-0.8.3 pexpect-4.8.0 pickleshare-0.7.5 platformdirs-3.0.0 prompt-toolkit-3.0.38 psutil-5.9.4 ptyprocess-0.7.0 pure-eval-0.2.2 pygments-2.14.0 python-dateutil-2.8.2 pyzmq-25.1.1 six-1.16.0 stack-data-0.6.2 tornado-6.3.3 traitlets-5.9.0 wcwidth-0.2.5\n"
     ]
    }
   ],
   "source": [
    "!/home/ma-user/anaconda3/bin/conda create -n python-3.10.10 python=3.10.10 -y --override-channels --channel https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main\n",
    "!/home/ma-user/anaconda3/envs/python-3.10.10/bin/pip install ipykernel"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "738f430c-401d-48d2-b330-2eca3a2a629c",
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "import os\n",
    "\n",
    "data = {\n",
    "   \"display_name\": \"python-3.10.10\",\n",
    "   \"env\": {\n",
    "      \"PATH\": \"/home/ma-user/anaconda3/envs/python-3.10.10/bin:/home/ma-user/anaconda3/envs/python-3.7.10/bin:/modelarts/authoring/notebook-conda/bin:/opt/conda/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/home/ma-user/modelarts/ma-cli/bin:/home/ma-user/modelarts/ma-cli/bin:/home/ma-user/anaconda3/envs/PyTorch-1.8/bin\"\n",
    "   },\n",
    "   \"language\": \"python\",\n",
    "   \"argv\": [\n",
    "      \"/home/ma-user/anaconda3/envs/python-3.10.10/bin/python\",\n",
    "      \"-m\",\n",
    "      \"ipykernel\",\n",
    "      \"-f\",\n",
    "      \"{connection_file}\"\n",
    "   ]\n",
    "}\n",
    "\n",
    "os.makedirs(\"/home/ma-user/anaconda3/share/jupyter/kernels/python-3.10.10/\", exist_ok=True)\n",
    "\n",
    "with open('/home/ma-user/anaconda3/share/jupyter/kernels/python-3.10.10/kernel.json', 'w') as f:\n",
    "    json.dump(data, f, indent=4)"
   ]
  },
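  {
   "cell_type": "markdown",
   "id": "kernelspec-check-note",
   "metadata": {},
   "source": [
    "A minimal optional check (assumption: the `jupyter` CLI is available in this environment): after writing kernel.json above, list the registered kernel specs to confirm that python-3.10.10 appears.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "kernelspec-check-code",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Optional sanity check: list registered Jupyter kernel specs;\n",
    "# python-3.10.10 should appear if kernel.json was written correctly.\n",
    "!jupyter kernelspec list"
   ]
  },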
  {
   "cell_type": "markdown",
   "id": "39d8b96b-98bc-489f-becc-998c27b994dc",
   "metadata": {},
   "source": [
    "After the kernel spec is created, wait a moment or refresh the page, then click the kernel selector in the upper-right corner and choose python-3.10.10.\n",
    "\n",
    "![](https://modelarts-labs-bj4-v2.obs.cn-north-4.myhuaweicloud.com/case_zoo/chatglm3/image/2.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "283d52cf-12c6-4330-ab15-40f3c0f2493e",
   "metadata": {},
   "source": [
    "### 2.3 Install Dependencies"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "6840d348-a071-42d0-b96f-4e6ecff9547d",
   "metadata": {
    "scrolled": true,
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Looking in indexes: http://repo.myhuaweicloud.com/repository/pypi/simple\n",
      "\n",
      "Collecting transformers==4.30.2\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/5b/0b/e45d26ccd28568013523e04f325432ea88a442b4e3020b757cf4361f0120/transformers-4.30.2-py3-none-any.whl (7.2 MB)\n",
      "\n",
      "Collecting filelock (from transformers==4.30.2)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/00/45/ec3407adf6f6b5bf867a4462b2b0af27597a26bd3cd6e2534cb6ab029938/filelock-3.12.2-py3-none-any.whl (10 kB)\n",
      "\n",
      "Collecting huggingface-hub<1.0,>=0.14.1 (from transformers==4.30.2)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/7f/c4/adcbe9a696c135578cabcbdd7331332daad4d49b7c43688bc2d36b3a47d2/huggingface_hub-0.16.4-py3-none-any.whl (268 kB)\n",
      "\n",
      "Collecting numpy>=1.17 (from transformers==4.30.2)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/71/3c/3b1981c6a1986adc9ee7db760c0c34ea5b14ac3da9ecfcf1ea2a4ec6c398/numpy-1.25.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.2 MB)\n",
      "\n",
      "Requirement already satisfied: packaging>=20.0 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from transformers==4.30.2) (23.2)\n",
      "\n",
      "Collecting pyyaml>=5.1 (from transformers==4.30.2)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/29/61/bf33c6c85c55bc45a29eee3195848ff2d518d84735eb0e2d8cb42e0d285e/PyYAML-6.0.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (705 kB)\n",
      "\n",
      "Collecting regex!=2019.12.17 (from transformers==4.30.2)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/7c/81/b064cc2c67ca2182137641f9d3fd47fe470f1a84674d9b9f91fd39bf0e6f/regex-2023.5.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (769 kB)\n",
      "\n",
      "Collecting requests (from transformers==4.30.2)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/70/8e/0e2d847013cb52cd35b38c009bb167a1a26b2ce6cd6965bf26b47bc0bf44/requests-2.31.0-py3-none-any.whl (62 kB)\n",
      "\n",
      "Collecting tokenizers!=0.11.3,<0.14,>=0.11.1 (from transformers==4.30.2)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/94/60/ff26cce378023624ffcad91edaa4871f561d6ba7295185c45037ddba80e2/tokenizers-0.13.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.8 MB)\n",
      "\n",
      "Collecting safetensors>=0.3.1 (from transformers==4.30.2)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/6c/f0/c17bbdb1e5f9dab29d44cade445135789f75f8f08ea2728d04493ea8412b/safetensors-0.3.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)\n",
      "\n",
      "Collecting tqdm>=4.27 (from transformers==4.30.2)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/00/e5/f12a80907d0884e6dff9c16d0c0114d81b8cd07dc3ae54c5e962cc83037e/tqdm-4.66.1-py3-none-any.whl (78 kB)\n",
      "\n",
      "Collecting fsspec (from huggingface-hub<1.0,>=0.14.1->transformers==4.30.2)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/37/57/eb7c3c10b187d3b8565946772ce0229c79e3c623010eda0aeb5032ff56f4/fsspec-2022.11.0-py3-none-any.whl (139 kB)\n",
      "\n",
      "Collecting typing-extensions>=3.7.4.3 (from huggingface-hub<1.0,>=0.14.1->transformers==4.30.2)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/24/21/7d397a4b7934ff4028987914ac1044d3b7d52712f30e2ac7a2ae5bc86dd0/typing_extensions-4.8.0-py3-none-any.whl (31 kB)\n",
      "\n",
      "Collecting charset-normalizer<4,>=2 (from requests->transformers==4.30.2)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/da/f1/3702ba2a7470666a62fd81c58a4c40be00670e5006a67f4d626e57f013ae/charset_normalizer-3.3.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (142 kB)\n",
      "\n",
      "Collecting idna<4,>=2.5 (from requests->transformers==4.30.2)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/fc/34/3030de6f1370931b9dbb4dad48f6ab1015ab1d32447850b9fc94e60097be/idna-3.4-py3-none-any.whl (61 kB)\n",
      "\n",
      "Collecting urllib3<3,>=1.21.1 (from requests->transformers==4.30.2)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/96/94/c31f58c7a7f470d5665935262ebd7455c7e4c7782eb525658d3dbf4b9403/urllib3-2.1.0-py3-none-any.whl (104 kB)\n",
      "\n",
      "Collecting certifi>=2017.4.17 (from requests->transformers==4.30.2)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/64/62/428ef076be88fa93716b576e4a01f919d25968913e817077a386fcbe4f42/certifi-2023.11.17-py3-none-any.whl (162 kB)\n",
      "\n",
      "Installing collected packages: tokenizers, safetensors, urllib3, typing-extensions, tqdm, regex, pyyaml, numpy, idna, fsspec, filelock, charset-normalizer, certifi, requests, huggingface-hub, transformers\n",
      "\n",
      "Successfully installed certifi-2023.11.17 charset-normalizer-3.3.2 filelock-3.12.2 fsspec-2022.11.0 huggingface-hub-0.16.4 idna-3.4 numpy-1.25.2 pyyaml-6.0.1 regex-2023.5.5 requests-2.31.0 safetensors-0.3.3 tokenizers-0.13.3 tqdm-4.66.1 transformers-4.30.2 typing-extensions-4.8.0 urllib3-2.1.0\n",
      "\n",
      "Looking in indexes: http://repo.myhuaweicloud.com/repository/pypi/simple\n",
      "\n",
      "Collecting sentencepiece==0.1.99\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/7f/e5/323dc813b3e1339305f888d035e2f3725084fc4dcf051995b366dd26cc90/sentencepiece-0.1.99-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)\n",
      "\n",
      "Installing collected packages: sentencepiece\n",
      "\n",
      "Successfully installed sentencepiece-0.1.99\n",
      "\n",
      "Looking in indexes: http://repo.myhuaweicloud.com/repository/pypi/simple\n",
      "\n",
      "Collecting torch==2.0.1\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/8c/4d/17e07377c9c3d1a0c4eb3fde1c7c16b5a0ce6133ddbabc08ceef6b7f2645/torch-2.0.1-cp310-cp310-manylinux1_x86_64.whl (619.9 MB)\n",
      "\n",
      "Requirement already satisfied: filelock in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from torch==2.0.1) (3.12.2)\n",
      "\n",
      "Requirement already satisfied: typing-extensions in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from torch==2.0.1) (4.8.0)\n",
      "\n",
      "Collecting sympy (from torch==2.0.1)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/d2/05/e6600db80270777c4a64238a98d442f0fd07cc8915be2a1c16da7f2b9e74/sympy-1.12-py3-none-any.whl (5.7 MB)\n",
      "\n",
      "Collecting networkx (from torch==2.0.1)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/42/31/d2f89f1ae42718f8c8a9e440ebe38d7d5fe1e0d9eb9178ce779e365b3ab0/networkx-2.8.8-py3-none-any.whl (2.0 MB)\n",
      "\n",
      "Collecting jinja2 (from torch==2.0.1)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/bc/c3/f068337a370801f372f2f8f6bad74a5c140f6fda3d9de154052708dd3c65/Jinja2-3.1.2-py3-none-any.whl (133 kB)\n",
      "\n",
      "Collecting nvidia-cuda-nvrtc-cu11==11.7.99 (from torch==2.0.1)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/ea/8d/0709ba16c2831c17ec1c2ea1eeb89ada11ffa8d966d773cce0a7463b22bb/nvidia_cuda_nvrtc_cu11-11.7.99-py3-none-manylinux1_x86_64.whl (21.0 MB)\n",
      "\n",
      "Collecting nvidia-cuda-runtime-cu11==11.7.99 (from torch==2.0.1)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/36/92/89cf558b514125d2ebd8344dd2f0533404b416486ff681d5434a5832a019/nvidia_cuda_runtime_cu11-11.7.99-py3-none-manylinux1_x86_64.whl (849 kB)\n",
      "\n",
      "Collecting nvidia-cuda-cupti-cu11==11.7.101 (from torch==2.0.1)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/e6/9d/dd0cdcd800e642e3c82ee3b5987c751afd4f3fb9cc2752517f42c3bc6e49/nvidia_cuda_cupti_cu11-11.7.101-py3-none-manylinux1_x86_64.whl (11.8 MB)\n",
      "\n",
      "Collecting nvidia-cudnn-cu11==8.5.0.96 (from torch==2.0.1)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/dc/30/66d4347d6e864334da5bb1c7571305e501dcb11b9155971421bb7bb5315f/nvidia_cudnn_cu11-8.5.0.96-2-py3-none-manylinux1_x86_64.whl (557.1 MB)\n",
      "\n",
      "Collecting nvidia-cublas-cu11==11.10.3.66 (from torch==2.0.1)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/ce/41/fdeb62b5437996e841d83d7d2714ca75b886547ee8017ee2fe6ea409d983/nvidia_cublas_cu11-11.10.3.66-py3-none-manylinux1_x86_64.whl (317.1 MB)\n",
      "\n",
      "Collecting nvidia-cufft-cu11==10.9.0.58 (from torch==2.0.1)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/74/79/b912a77e38e41f15a0581a59f5c3548d1ddfdda3225936fb67c342719e7a/nvidia_cufft_cu11-10.9.0.58-py3-none-manylinux1_x86_64.whl (168.4 MB)\n",
      "\n",
      "Collecting nvidia-curand-cu11==10.2.10.91 (from torch==2.0.1)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/8f/11/af78d54b2420e64a4dd19e704f5bb69dcb5a6a3138b4465d6a48cdf59a21/nvidia_curand_cu11-10.2.10.91-py3-none-manylinux1_x86_64.whl (54.6 MB)\n",
      "\n",
      "Collecting nvidia-cusolver-cu11==11.4.0.1 (from torch==2.0.1)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/25/4b/272f9aa7838e545b47878e4aec4f09b0fecf17dbd312cf5c5dc398b0637f/nvidia_cusolver_cu11-11.4.0.1-py3-none-manylinux1_x86_64.whl (102.6 MB)\n",
      "\n",
      "Collecting nvidia-cusparse-cu11==11.7.4.91 (from torch==2.0.1)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/ea/6f/6d032cc1bb7db88a989ddce3f4968419a7edeafda362847f42f614b1f845/nvidia_cusparse_cu11-11.7.4.91-py3-none-manylinux1_x86_64.whl (173.2 MB)\n",
      "\n",
      "Collecting nvidia-nccl-cu11==2.14.3 (from torch==2.0.1)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/55/92/914cdb650b6a5d1478f83148597a25e90ea37d739bd563c5096b0e8a5f43/nvidia_nccl_cu11-2.14.3-py3-none-manylinux1_x86_64.whl (177.1 MB)\n",
      "\n",
      "Collecting nvidia-nvtx-cu11==11.7.91 (from torch==2.0.1)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/23/d5/09493ff0e64fd77523afbbb075108f27a13790479efe86b9ffb4587671b5/nvidia_nvtx_cu11-11.7.91-py3-none-manylinux1_x86_64.whl (98 kB)\n",
      "\n",
      "Collecting triton==2.0.0 (from torch==2.0.1)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/ca/31/ff6be541195daf77aa5c72303b2354661a69e717967d44d91eb4f3fdce32/triton-2.0.0-1-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (63.3 MB)\n",
      "\n",
      "Requirement already satisfied: setuptools in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from nvidia-cublas-cu11==11.10.3.66->torch==2.0.1) (68.0.0)\n",
      "\n",
      "Requirement already satisfied: wheel in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from nvidia-cublas-cu11==11.10.3.66->torch==2.0.1) (0.41.2)\n",
      "\n",
      "Collecting cmake (from triton==2.0.0->torch==2.0.1)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/2e/51/3a4672a819b4532a378bfefad8f886cfe71057556e0d4eefb64523fd370a/cmake-3.27.2-py2.py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (26.1 MB)\n",
      "\n",
      "Collecting lit (from triton==2.0.0->torch==2.0.1)\n",
      "\n",
      "  Using cached lit-16.0.5-py3-none-any.whl\n",
      "\n",
      "Collecting MarkupSafe>=2.0 (from jinja2->torch==2.0.1)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/12/b3/d9ed2c0971e1435b8a62354b18d3060b66c8cb1d368399ec0b9baa7c0ee5/MarkupSafe-2.1.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (25 kB)\n",
      "\n",
      "Collecting mpmath>=0.19 (from sympy->torch==2.0.1)\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/43/e3/7d92a15f894aa0c9c4b49b8ee9ac9850d6e63b03c9c32c0367a13ae62209/mpmath-1.3.0-py3-none-any.whl (536 kB)\n",
      "\n",
      "Installing collected packages: mpmath, lit, cmake, sympy, nvidia-nvtx-cu11, nvidia-nccl-cu11, nvidia-cusparse-cu11, nvidia-cusolver-cu11, nvidia-curand-cu11, nvidia-cufft-cu11, nvidia-cuda-runtime-cu11, nvidia-cuda-nvrtc-cu11, nvidia-cuda-cupti-cu11, nvidia-cublas-cu11, networkx, MarkupSafe, nvidia-cudnn-cu11, jinja2, triton, torch\n",
      "\n",
      "Successfully installed MarkupSafe-2.1.3 cmake-3.27.2 jinja2-3.1.2 lit-16.0.5 mpmath-1.3.0 networkx-2.8.8 nvidia-cublas-cu11-11.10.3.66 nvidia-cuda-cupti-cu11-11.7.101 nvidia-cuda-nvrtc-cu11-11.7.99 nvidia-cuda-runtime-cu11-11.7.99 nvidia-cudnn-cu11-8.5.0.96 nvidia-cufft-cu11-10.9.0.58 nvidia-curand-cu11-10.2.10.91 nvidia-cusolver-cu11-11.4.0.1 nvidia-cusparse-cu11-11.7.4.91 nvidia-nccl-cu11-2.14.3 nvidia-nvtx-cu11-11.7.91 sympy-1.12 torch-2.0.1 triton-2.0.0\n",
      "\n",
      "Looking in indexes: http://repo.myhuaweicloud.com/repository/pypi/simple\n",
      "\n",
      "Collecting markdown==3.4.3\n",
      "\n",
      "  Using cached http://repo.myhuaweicloud.com/repository/pypi/packages/9a/a1/1352b0e5a3c71a79fa9265726e2217f69df9fd4de0bcb5725cc61f62a5df/Markdown-3.4.3-py3-none-any.whl (93 kB)\n",
      "\n",
      "Installing collected packages: markdown\n",
      "\n",
      "Successfully installed markdown-3.4.3\n",
      "\n",
      "Looking in indexes: http://repo.myhuaweicloud.com/repository/pypi/simple\n",
      "\n",
      "Collecting faiss-gpu==1.7.2\n",
      "\n",
      "  Downloading http://repo.myhuaweicloud.com/repository/pypi/packages/a8/71/623896382d90a9a99adf3438aa2c575535ba37804be9701d66f3337afd83/faiss_gpu-1.7.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (85.5 MB)\n",
      "\n",
      "     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 85.5/85.5 MB 42.4 MB/s eta 0:00:00\n",
      "\n",
      "Installing collected packages: faiss-gpu\n",
      "\n",
      "Successfully installed faiss-gpu-1.7.2\n",
      "\n",
      "Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple\n",
      "\n",
      "Collecting langchain==0.0.329\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/42/4e/86204994aeb2e4ac367a7fade896b13532eae2430299052eb2c80ca35d2c/langchain-0.0.329-py3-none-any.whl (2.0 MB)\n",
      "\n",
      "Requirement already satisfied: PyYAML>=5.3 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from langchain==0.0.329) (6.0.1)\n",
      "\n",
      "Collecting SQLAlchemy<3,>=1.4 (from langchain==0.0.329)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/aa/1c/0b66318368b1c9ef51c5c8560530b8ef842164e10eea08cacb06739388e0/SQLAlchemy-2.0.23-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.0 MB)\n",
      "\n",
      "Collecting aiohttp<4.0.0,>=3.8.3 (from langchain==0.0.329)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/b0/36/c7bd200871e7351ab8396e8edcbb91e1198e0ded67a0824c93110c4c5df2/aiohttp-3.9.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.2 MB)\n",
      "\n",
      "Collecting anyio<4.0 (from langchain==0.0.329)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/19/24/44299477fe7dcc9cb58d0a57d5a7588d6af2ff403fdd2d47a246c91a3246/anyio-3.7.1-py3-none-any.whl (80 kB)\n",
      "\n",
      "Collecting async-timeout<5.0.0,>=4.0.0 (from langchain==0.0.329)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/a7/fa/e01228c2938de91d47b307831c62ab9e4001e747789d0b05baf779a6488c/async_timeout-4.0.3-py3-none-any.whl (5.7 kB)\n",
      "\n",
      "Collecting dataclasses-json<0.7,>=0.5.7 (from langchain==0.0.329)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/8d/e2/528c52001a743a7faa28e6d3095d9f01b472d3efee62d62101403bf1a70a/dataclasses_json-0.6.2-py3-none-any.whl (28 kB)\n",
      "\n",
      "Collecting jsonpatch<2.0,>=1.33 (from langchain==0.0.329)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/73/07/02e16ed01e04a374e644b575638ec7987ae846d25ad97bcc9945a3ee4b0e/jsonpatch-1.33-py2.py3-none-any.whl (12 kB)\n",
      "\n",
      "Collecting langsmith<0.1.0,>=0.0.52 (from langchain==0.0.329)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/84/9e/208314830d8c523dae4dec41ab5aeeb2d42dc1667bbc3ff8b875244b3012/langsmith-0.0.66-py3-none-any.whl (46 kB)\n",
      "\n",
      "Requirement already satisfied: numpy<2,>=1 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from langchain==0.0.329) (1.25.2)\n",
      "\n",
      "Collecting pydantic<3,>=1 (from langchain==0.0.329)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/0a/2b/64066de1c4cf3d4ed623beeb3bbf3f8d0cc26661f1e7d180ec5eb66b75a5/pydantic-2.5.2-py3-none-any.whl (381 kB)\n",
      "\n",
      "Requirement already satisfied: requests<3,>=2 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from langchain==0.0.329) (2.31.0)\n",
      "\n",
      "Collecting tenacity<9.0.0,>=8.1.0 (from langchain==0.0.329)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/f4/f1/990741d5bb2487d529d20a433210ffa136a367751e454214013b441c4575/tenacity-8.2.3-py3-none-any.whl (24 kB)\n",
      "\n",
      "Collecting attrs>=17.3.0 (from aiohttp<4.0.0,>=3.8.3->langchain==0.0.329)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/f0/eb/fcb708c7bf5056045e9e98f62b93bd7467eb718b0202e7698eb11d66416c/attrs-23.1.0-py3-none-any.whl (61 kB)\n",
      "\n",
      "Collecting multidict<7.0,>=4.5 (from aiohttp<4.0.0,>=3.8.3->langchain==0.0.329)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/56/b5/ac112889bfc68e6cf4eda1e4325789b166c51c6cd29d5633e28fb2c2f966/multidict-6.0.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (114 kB)\n",
      "\n",
      "Collecting yarl<2.0,>=1.0 (from aiohttp<4.0.0,>=3.8.3->langchain==0.0.329)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/b6/b2/44b31699e27f82c577143d062a2b58cbe0c6e7a0828d13c0ffd10891ad40/yarl-1.9.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (300 kB)\n",
      "\n",
      "Collecting frozenlist>=1.1.1 (from aiohttp<4.0.0,>=3.8.3->langchain==0.0.329)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/1e/28/74b8b6451c89c070d34e753d8b65a1e4ce508a6808b18529f36e8c0e2184/frozenlist-1.4.0-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (225 kB)\n",
      "\n",
      "Collecting aiosignal>=1.1.2 (from aiohttp<4.0.0,>=3.8.3->langchain==0.0.329)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/76/ac/a7305707cb852b7e16ff80eaf5692309bde30e2b1100a1fcacdc8f731d97/aiosignal-1.3.1-py3-none-any.whl (7.6 kB)\n",
      "\n",
      "Requirement already satisfied: idna>=2.8 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from anyio<4.0->langchain==0.0.329) (3.4)\n",
      "\n",
      "Collecting sniffio>=1.1 (from anyio<4.0->langchain==0.0.329)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/c3/a0/5dba8ed157b0136607c7f2151db695885606968d1fae123dc3391e0cfdbf/sniffio-1.3.0-py3-none-any.whl (10 kB)\n",
      "\n",
      "Collecting exceptiongroup (from anyio<4.0->langchain==0.0.329)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/b8/9a/5028fd52db10e600f1c4674441b968cf2ea4959085bfb5b99fb1250e5f68/exceptiongroup-1.2.0-py3-none-any.whl (16 kB)\n",
      "\n",
      "Collecting marshmallow<4.0.0,>=3.18.0 (from dataclasses-json<0.7,>=0.5.7->langchain==0.0.329)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/ed/3c/cebfdcad015240014ff08b883d1c0c427f2ba45ae8c6572851b6ef136cad/marshmallow-3.20.1-py3-none-any.whl (49 kB)\n",
      "\n",
      "Collecting typing-inspect<1,>=0.4.0 (from dataclasses-json<0.7,>=0.5.7->langchain==0.0.329)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/65/f3/107a22063bf27bdccf2024833d3445f4eea42b2e598abfbd46f6a63b6cb0/typing_inspect-0.9.0-py3-none-any.whl (8.8 kB)\n",
      "\n",
      "Collecting jsonpointer>=1.9 (from jsonpatch<2.0,>=1.33->langchain==0.0.329)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/12/f6/0232cc0c617e195f06f810534d00b74d2f348fe71b2118009ad8ad31f878/jsonpointer-2.4-py2.py3-none-any.whl (7.8 kB)\n",
      "\n",
      "Collecting annotated-types>=0.4.0 (from pydantic<3,>=1->langchain==0.0.329)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/28/78/d31230046e58c207284c6b2c4e8d96e6d3cb4e52354721b944d3e1ee4aa5/annotated_types-0.6.0-py3-none-any.whl (12 kB)\n",
      "\n",
      "Collecting pydantic-core==2.14.5 (from pydantic<3,>=1->langchain==0.0.329)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/7c/f5/3e59681bd53955da311a7f4efbb6315d01006e9d18b8a06b527a22d3d923/pydantic_core-2.14.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.1 MB)\n",
      "\n",
      "Requirement already satisfied: typing-extensions>=4.6.1 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from pydantic<3,>=1->langchain==0.0.329) (4.8.0)\n",
      "\n",
      "Requirement already satisfied: charset-normalizer<4,>=2 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from requests<3,>=2->langchain==0.0.329) (3.3.2)\n",
      "\n",
      "Requirement already satisfied: urllib3<3,>=1.21.1 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from requests<3,>=2->langchain==0.0.329) (2.1.0)\n",
      "\n",
      "Requirement already satisfied: certifi>=2017.4.17 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from requests<3,>=2->langchain==0.0.329) (2023.11.17)\n",
      "\n",
      "Collecting greenlet!=0.4.17 (from SQLAlchemy<3,>=1.4->langchain==0.0.329)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/da/ab/7cc6502628565d70dce2edb619d87554d65ac4e2f17c805a5a45bfaefa5c/greenlet-3.0.1-cp310-cp310-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (613 kB)\n",
      "\n",
      "Requirement already satisfied: packaging>=17.0 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from marshmallow<4.0.0,>=3.18.0->dataclasses-json<0.7,>=0.5.7->langchain==0.0.329) (23.2)\n",
      "\n",
      "Collecting mypy-extensions>=0.3.0 (from typing-inspect<1,>=0.4.0->dataclasses-json<0.7,>=0.5.7->langchain==0.0.329)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/2a/e2/5d3f6ada4297caebe1a2add3b126fe800c96f56dbe5d1988a2cbe0b267aa/mypy_extensions-1.0.0-py3-none-any.whl (4.7 kB)\n",
      "\n",
      "Installing collected packages: tenacity, sniffio, pydantic-core, mypy-extensions, multidict, marshmallow, jsonpointer, greenlet, frozenlist, exceptiongroup, attrs, async-timeout, annotated-types, yarl, typing-inspect, SQLAlchemy, pydantic, jsonpatch, anyio, aiosignal, langsmith, dataclasses-json, aiohttp, langchain\n",
      "\n",
      "Successfully installed SQLAlchemy-2.0.23 aiohttp-3.9.0 aiosignal-1.3.1 annotated-types-0.6.0 anyio-3.7.1 async-timeout-4.0.3 attrs-23.1.0 dataclasses-json-0.6.2 exceptiongroup-1.2.0 frozenlist-1.4.0 greenlet-3.0.1 jsonpatch-1.33 jsonpointer-2.4 langchain-0.0.329 langsmith-0.0.66 marshmallow-3.20.1 multidict-6.0.4 mypy-extensions-1.0.0 pydantic-2.5.2 pydantic-core-2.14.5 sniffio-1.3.0 tenacity-8.2.3 typing-inspect-0.9.0 yarl-1.9.3\n",
      "\n",
      "Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple\n",
      "\n",
      "Collecting nltk==3.8.1\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/a6/0a/0d20d2c0f16be91b9fa32a77b76c60f9baf6eba419e5ef5deca17af9c582/nltk-3.8.1-py3-none-any.whl (1.5 MB)\n",
      "\n",
      "Collecting click (from nltk==3.8.1)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/00/2e/d53fa4befbf2cfa713304affc7ca780ce4fc1fd8710527771b58311a3229/click-8.1.7-py3-none-any.whl (97 kB)\n",
      "\n",
      "Collecting joblib (from nltk==3.8.1)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/10/40/d551139c85db202f1f384ba8bcf96aca2f329440a844f924c8a0040b6d02/joblib-1.3.2-py3-none-any.whl (302 kB)\n",
      "\n",
      "Requirement already satisfied: regex>=2021.8.3 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from nltk==3.8.1) (2023.5.5)\n",
      "\n",
      "Requirement already satisfied: tqdm in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from nltk==3.8.1) (4.66.1)\n",
      "\n",
      "Installing collected packages: joblib, click, nltk\n",
      "\n",
      "Successfully installed click-8.1.7 joblib-1.3.2 nltk-3.8.1\n",
      "\n",
      "Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple\n",
      "\n",
      "Collecting unstructured==0.10.24\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/5d/10/f36ec76d07acee92ea7432c00e41d15ebf47df1db994778a1710da77e4a2/unstructured-0.10.24-py3-none-any.whl (1.7 MB)\n",
      "\n",
      "Collecting chardet (from unstructured==0.10.24)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/38/6f/f5fbc992a329ee4e0f288c1fe0e2ad9485ed064cac731ed2fe47dcc38cbf/chardet-5.2.0-py3-none-any.whl (199 kB)\n",
      "\n",
      "Collecting filetype (from unstructured==0.10.24)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/18/79/1b8fa1bb3568781e84c9200f951c735f3f157429f44be0495da55894d620/filetype-1.2.0-py2.py3-none-any.whl (19 kB)\n",
      "\n",
      "Collecting python-magic (from unstructured==0.10.24)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/6c/73/9f872cb81fc5c3bb48f7227872c28975f998f3e7c2b1c16e95e6432bbb90/python_magic-0.4.27-py2.py3-none-any.whl (13 kB)\n",
      "\n",
      "Collecting lxml (from unstructured==0.10.24)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/01/ae/ce23856fb6065f254101c1df381050b13adf26088dd554a15776615d470f/lxml-4.9.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (7.0 MB)\n",
      "\n",
      "Requirement already satisfied: nltk in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from unstructured==0.10.24) (3.8.1)\n",
      "\n",
      "Collecting tabulate (from unstructured==0.10.24)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/40/44/4a5f08c96eb108af5cb50b41f76142f0afa346dfa99d5296fe7202a11854/tabulate-0.9.0-py3-none-any.whl (35 kB)\n",
      "\n",
      "Requirement already satisfied: requests in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from unstructured==0.10.24) (2.31.0)\n",
      "\n",
      "Collecting beautifulsoup4 (from unstructured==0.10.24)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/57/f4/a69c20ee4f660081a7dedb1ac57f29be9378e04edfcb90c526b923d4bebc/beautifulsoup4-4.12.2-py3-none-any.whl (142 kB)\n",
      "\n",
      "Collecting emoji (from unstructured==0.10.24)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/96/c6/0114b2040a96561fd1b44c75df749bbd3c898bf8047fb5ce8d7590d2dee6/emoji-2.8.0-py2.py3-none-any.whl (358 kB)\n",
      "\n",
      "Requirement already satisfied: dataclasses-json in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from unstructured==0.10.24) (0.6.2)\n",
      "\n",
      "Collecting python-iso639 (from unstructured==0.10.24)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/b8/6d/5d1f7e5c1b0c58b700eb67dbb570f9381afc90bc0535686a89e90eac5dfb/python_iso639-2023.6.15-py3-none-any.whl (275 kB)\n",
      "\n",
      "Collecting langdetect (from unstructured==0.10.24)\n",
      "\n",
      "  Using cached langdetect-1.0.9-py3-none-any.whl\n",
      "\n",
      "Requirement already satisfied: numpy in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from unstructured==0.10.24) (1.25.2)\n",
      "\n",
      "Collecting rapidfuzz (from unstructured==0.10.24)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/58/69/55f7a10f5760d8539081718f1a87172cd5b0ea21f6c1232c40a6b7cf9470/rapidfuzz-3.5.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.3 MB)\n",
      "\n",
      "Collecting backoff (from unstructured==0.10.24)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/df/73/b6e24bd22e6720ca8ee9a85a0c4a2971af8497d8f3193fa05390cbd46e09/backoff-2.2.1-py3-none-any.whl (15 kB)\n",
      "\n",
      "Collecting soupsieve>1.2 (from beautifulsoup4->unstructured==0.10.24)\n",
      "\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/4c/f3/038b302fdfbe3be7da016777069f26ceefe11a681055ea1f7817546508e3/soupsieve-2.5-py3-none-any.whl (36 kB)\n",
      "\n",
      "Requirement already satisfied: marshmallow<4.0.0,>=3.18.0 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from dataclasses-json->unstructured==0.10.24) (3.20.1)\n",
      "\n",
      "Requirement already satisfied: typing-inspect<1,>=0.4.0 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from dataclasses-json->unstructured==0.10.24) (0.9.0)\n",
      "\n",
      "Requirement already satisfied: six in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from langdetect->unstructured==0.10.24) (1.16.0)\n",
      "\n",
      "Requirement already satisfied: click in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from nltk->unstructured==0.10.24) (8.1.7)\n",
      "\n",
      "Requirement already satisfied: joblib in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from nltk->unstructured==0.10.24) (1.3.2)\n",
      "\n",
      "Requirement already satisfied: regex>=2021.8.3 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from nltk->unstructured==0.10.24) (2023.5.5)\n",
      "\n",
      "Requirement already satisfied: tqdm in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from nltk->unstructured==0.10.24) (4.66.1)\n",
      "\n",
      "Requirement already satisfied: charset-normalizer<4,>=2 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from requests->unstructured==0.10.24) (3.3.2)\n",
      "\n",
      "Requirement already satisfied: idna<4,>=2.5 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from requests->unstructured==0.10.24) (3.4)\n",
      "\n",
      "Requirement already satisfied: urllib3<3,>=1.21.1 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from requests->unstructured==0.10.24) (2.1.0)\n",
      "\n",
      "Requirement already satisfied: certifi>=2017.4.17 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from requests->unstructured==0.10.24) (2023.11.17)\n",
      "\n",
      "Requirement already satisfied: packaging>=17.0 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from marshmallow<4.0.0,>=3.18.0->dataclasses-json->unstructured==0.10.24) (23.2)\n",
      "\n",
      "Requirement already satisfied: mypy-extensions>=0.3.0 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from typing-inspect<1,>=0.4.0->dataclasses-json->unstructured==0.10.24) (1.0.0)\n",
      "\n",
      "Requirement already satisfied: typing-extensions>=3.7.4 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from typing-inspect<1,>=0.4.0->dataclasses-json->unstructured==0.10.24) (4.8.0)\n",
      "\n",
      "Installing collected packages: filetype, tabulate, soupsieve, rapidfuzz, python-magic, python-iso639, lxml, langdetect, emoji, chardet, backoff, beautifulsoup4, unstructured\n",
      "\n",
      "Successfully installed backoff-2.2.1 beautifulsoup4-4.12.2 chardet-5.2.0 emoji-2.8.0 filetype-1.2.0 langdetect-1.0.9 lxml-4.9.3 python-iso639-2023.6.15 python-magic-0.4.27 rapidfuzz-3.5.2 soupsieve-2.5 tabulate-0.9.0 unstructured-0.10.24\n",
      "\n",
      "Looking in indexes: http://repo.myhuaweicloud.com/repository/pypi/simple\n",
      "\n",
      "Collecting sentence-transformers==2.2.2\n",
      "\n",
      "  Downloading http://repo.myhuaweicloud.com/repository/pypi/packages/20/9c/f07bd70d128fdb107bc02a0c702b9058b4fe147d0ba67b5a0f4c3cf15a54/sentence-transformers-2.2.2.tar.gz (85 kB)\n",
      "\n",
      "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m86.0/86.0 kB\u001b[0m \u001b[31m10.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
      "\n",
      "\u001b[?25h  Preparing metadata (setup.py) ... \u001b[?25ldone\n",
      "\n",
      "\u001b[?25hRequirement already satisfied: transformers<5.0.0,>=4.6.0 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from sentence-transformers==2.2.2) (4.30.2)\n",
      "\n",
      "Requirement already satisfied: tqdm in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from sentence-transformers==2.2.2) (4.66.1)\n",
      "\n",
      "Requirement already satisfied: torch>=1.6.0 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from sentence-transformers==2.2.2) (2.0.1)\n",
      "\n",
      "Collecting torchvision (from sentence-transformers==2.2.2)\n",
      "\n",
      "  Downloading http://repo.myhuaweicloud.com/repository/pypi/packages/87/0f/88f023bf6176d9af0f85feedf4be129f9cf2748801c4d9c690739a10c100/torchvision-0.15.2-cp310-cp310-manylinux1_x86_64.whl (6.0 MB)\n",
      "\n",
      "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m6.0/6.0 MB\u001b[0m \u001b[31m90.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0mta \u001b[36m0:00:01\u001b[0m\n",
      "\n",
      "\u001b[?25hRequirement already satisfied: numpy in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from sentence-transformers==2.2.2) (1.25.2)\n",
      "\n",
      "Collecting scikit-learn (from sentence-transformers==2.2.2)\n",
      "\n",
      "  Downloading http://repo.myhuaweicloud.com/repository/pypi/packages/5c/e9/ee572691a3fb05555bcde41826faad29ae4bc1fb07982e7f53d54a176879/scikit_learn-1.3.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (10.8 MB)\n",
      "\n",
      "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m10.8/10.8 MB\u001b[0m \u001b[31m60.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m00:01\u001b[0m00:01\u001b[0m\n",
      "\n",
      "\u001b[?25hCollecting scipy (from sentence-transformers==2.2.2)\n",
      "\n",
      "  Downloading http://repo.myhuaweicloud.com/repository/pypi/packages/a8/cc/c36f3439f5d47c3b13833ce6687b43a040cc7638c502ac46b41e2d4f3d6f/scipy-1.11.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (36.3 MB)\n",
      "\n",
      "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m36.3/36.3 MB\u001b[0m \u001b[31m59.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m00:01\u001b[0m00:01\u001b[0m\n",
      "\n",
      "\u001b[?25hRequirement already satisfied: nltk in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from sentence-transformers==2.2.2) (3.8.1)\n",
      "\n",
      "Requirement already satisfied: sentencepiece in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from sentence-transformers==2.2.2) (0.1.99)\n",
      "\n",
      "Requirement already satisfied: huggingface-hub>=0.4.0 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from sentence-transformers==2.2.2) (0.16.4)\n",
      "\n",
      "Requirement already satisfied: filelock in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from huggingface-hub>=0.4.0->sentence-transformers==2.2.2) (3.12.2)\n",
      "\n",
      "Requirement already satisfied: fsspec in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from huggingface-hub>=0.4.0->sentence-transformers==2.2.2) (2022.11.0)\n",
      "\n",
      "Requirement already satisfied: requests in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from huggingface-hub>=0.4.0->sentence-transformers==2.2.2) (2.31.0)\n",
      "\n",
      "Requirement already satisfied: pyyaml>=5.1 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from huggingface-hub>=0.4.0->sentence-transformers==2.2.2) (6.0.1)\n",
      "\n",
      "Requirement already satisfied: typing-extensions>=3.7.4.3 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from huggingface-hub>=0.4.0->sentence-transformers==2.2.2) (4.8.0)\n",
      "\n",
      "Requirement already satisfied: packaging>=20.9 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from huggingface-hub>=0.4.0->sentence-transformers==2.2.2) (23.2)\n",
      "\n",
      "Requirement already satisfied: sympy in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from torch>=1.6.0->sentence-transformers==2.2.2) (1.12)\n",
      "\n",
      "Requirement already satisfied: networkx in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from torch>=1.6.0->sentence-transformers==2.2.2) (2.8.8)\n",
      "\n",
      "Requirement already satisfied: jinja2 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from torch>=1.6.0->sentence-transformers==2.2.2) (3.1.2)\n",
      "\n",
      "Requirement already satisfied: nvidia-cuda-nvrtc-cu11==11.7.99 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from torch>=1.6.0->sentence-transformers==2.2.2) (11.7.99)\n",
      "\n",
      "Requirement already satisfied: nvidia-cuda-runtime-cu11==11.7.99 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from torch>=1.6.0->sentence-transformers==2.2.2) (11.7.99)\n",
      "\n",
      "Requirement already satisfied: nvidia-cuda-cupti-cu11==11.7.101 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from torch>=1.6.0->sentence-transformers==2.2.2) (11.7.101)\n",
      "\n",
      "Requirement already satisfied: nvidia-cudnn-cu11==8.5.0.96 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from torch>=1.6.0->sentence-transformers==2.2.2) (8.5.0.96)\n",
      "\n",
      "Requirement already satisfied: nvidia-cublas-cu11==11.10.3.66 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from torch>=1.6.0->sentence-transformers==2.2.2) (11.10.3.66)\n",
      "\n",
      "Requirement already satisfied: nvidia-cufft-cu11==10.9.0.58 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from torch>=1.6.0->sentence-transformers==2.2.2) (10.9.0.58)\n",
      "\n",
      "Requirement already satisfied: nvidia-curand-cu11==10.2.10.91 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from torch>=1.6.0->sentence-transformers==2.2.2) (10.2.10.91)\n",
      "\n",
      "Requirement already satisfied: nvidia-cusolver-cu11==11.4.0.1 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from torch>=1.6.0->sentence-transformers==2.2.2) (11.4.0.1)\n",
      "\n",
      "Requirement already satisfied: nvidia-cusparse-cu11==11.7.4.91 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from torch>=1.6.0->sentence-transformers==2.2.2) (11.7.4.91)\n",
      "\n",
      "Requirement already satisfied: nvidia-nccl-cu11==2.14.3 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from torch>=1.6.0->sentence-transformers==2.2.2) (2.14.3)\n",
      "\n",
      "Requirement already satisfied: nvidia-nvtx-cu11==11.7.91 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from torch>=1.6.0->sentence-transformers==2.2.2) (11.7.91)\n",
      "\n",
      "Requirement already satisfied: triton==2.0.0 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from torch>=1.6.0->sentence-transformers==2.2.2) (2.0.0)\n",
      "\n",
      "Requirement already satisfied: setuptools in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from nvidia-cublas-cu11==11.10.3.66->torch>=1.6.0->sentence-transformers==2.2.2) (68.0.0)\n",
      "\n",
      "Requirement already satisfied: wheel in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from nvidia-cublas-cu11==11.10.3.66->torch>=1.6.0->sentence-transformers==2.2.2) (0.41.2)\n",
      "\n",
      "Requirement already satisfied: cmake in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from triton==2.0.0->torch>=1.6.0->sentence-transformers==2.2.2) (3.27.2)\n",
      "\n",
      "Requirement already satisfied: lit in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from triton==2.0.0->torch>=1.6.0->sentence-transformers==2.2.2) (16.0.5)\n",
      "\n",
      "Requirement already satisfied: regex!=2019.12.17 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from transformers<5.0.0,>=4.6.0->sentence-transformers==2.2.2) (2023.5.5)\n",
      "\n",
      "Requirement already satisfied: tokenizers!=0.11.3,<0.14,>=0.11.1 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from transformers<5.0.0,>=4.6.0->sentence-transformers==2.2.2) (0.13.3)\n",
      "\n",
      "Requirement already satisfied: safetensors>=0.3.1 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from transformers<5.0.0,>=4.6.0->sentence-transformers==2.2.2) (0.3.3)\n",
      "\n",
      "Requirement already satisfied: click in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from nltk->sentence-transformers==2.2.2) (8.1.7)\n",
      "\n",
      "Requirement already satisfied: joblib in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from nltk->sentence-transformers==2.2.2) (1.3.2)\n",
      "\n",
      "Collecting threadpoolctl>=2.0.0 (from scikit-learn->sentence-transformers==2.2.2)\n",
      "\n",
      "  Downloading http://repo.myhuaweicloud.com/repository/pypi/packages/61/cf/6e354304bcb9c6413c4e02a747b600061c21d38ba51e7e544ac7bc66aecc/threadpoolctl-3.1.0-py3-none-any.whl (14 kB)\n",
      "\n",
      "Collecting pillow!=8.3.*,>=5.3.0 (from torchvision->sentence-transformers==2.2.2)\n",
      "\n",
      "  Downloading http://repo.myhuaweicloud.com/repository/pypi/packages/95/7b/71e2665760b5c33af00fa9bb6d6bca068b51bf021a4ceaeee03e18689f51/Pillow-10.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.5 MB)\n",
      "\n",
      "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m3.5/3.5 MB\u001b[0m \u001b[31m36.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0ma \u001b[36m0:00:01\u001b[0m\n",
      "\n",
      "\u001b[?25hRequirement already satisfied: MarkupSafe>=2.0 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from jinja2->torch>=1.6.0->sentence-transformers==2.2.2) (2.1.3)\n",
      "\n",
      "Requirement already satisfied: charset-normalizer<4,>=2 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from requests->huggingface-hub>=0.4.0->sentence-transformers==2.2.2) (3.3.2)\n",
      "\n",
      "Requirement already satisfied: idna<4,>=2.5 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from requests->huggingface-hub>=0.4.0->sentence-transformers==2.2.2) (3.4)\n",
      "\n",
      "Requirement already satisfied: urllib3<3,>=1.21.1 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from requests->huggingface-hub>=0.4.0->sentence-transformers==2.2.2) (2.1.0)\n",
      "\n",
      "Requirement already satisfied: certifi>=2017.4.17 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from requests->huggingface-hub>=0.4.0->sentence-transformers==2.2.2) (2023.11.17)\n",
      "\n",
      "Requirement already satisfied: mpmath>=0.19 in /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages (from sympy->torch>=1.6.0->sentence-transformers==2.2.2) (1.3.0)\n",
      "\n",
      "Building wheels for collected packages: sentence-transformers\n",
      "\n",
      "  Building wheel for sentence-transformers (setup.py) ... \u001b[?25ldone\n",
      "\n",
      "\u001b[?25h  Created wheel for sentence-transformers: filename=sentence_transformers-2.2.2-py3-none-any.whl size=125923 sha256=efb9a02fb84592fa23ee7f517976a394ccdedd63c307f1be709bd4a25f013d92\n",
      "\n",
      "  Stored in directory: /home/ma-user/.cache/pip/wheels/4f/9d/a5/9beabf87fdb3e143ae061b7bd2356c98c21bae0c908df108ee\n",
      "\n",
      "Successfully built sentence-transformers\n",
      "\n",
      "Installing collected packages: threadpoolctl, scipy, pillow, scikit-learn, torchvision, sentence-transformers\n",
      "\n",
      "Successfully installed pillow-10.1.0 scikit-learn-1.3.0 scipy-1.11.2 sentence-transformers-2.2.2 threadpoolctl-3.1.0 torchvision-0.15.2\n"
     ]
    }
   ],
   "source": [
    "!pip install transformers==4.30.2\n",
    "!pip install sentencepiece==0.1.99\n",
    "!pip install torch==2.0.1\n",
    "!pip install markdown==3.4.3\n",
    "!pip install faiss-gpu==1.7.2\n",
    "!pip install langchain==0.0.329 -i https://pypi.tuna.tsinghua.edu.cn/simple\n",
    "!pip install nltk==3.8.1 -i https://pypi.tuna.tsinghua.edu.cn/simple\n",
    "!pip install unstructured==0.10.24 -i https://pypi.tuna.tsinghua.edu.cn/simple\n",
    "!pip install sentence-transformers==2.2.2"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f6703d3e-8451-47e7-bac8-401cdb039be7",
   "metadata": {},
   "source": [
    "## 3. 中文文字匹配\n",
    "\n",
    "langchain支持很多中embedding模型，例如[text2vec-large-chinese](https://github.com/shibing624/text2vec)、[m3e-large](https://github.com/wangyingdong/m3e-base)、[bge-large-zh](https://github.com/jsonzhuwei/bge-large-zh)等，本文中使用text2vec-large-chinese模型实现。\n",
    "\n",
    "我们将待匹配的文字通过text2vec-large-chinese模型，转成嵌入向量，然后计算两个向量直接的相似性来进行匹配。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "8eea0c09-399f-4934-9c37-dd994cefb3b5",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import nltk\n",
    "work_dir = '/home/ma-user/work'\n",
    "docs_path = os.path.join(work_dir, 'docs')\n",
    "nltk.data.path.append(os.path.join(work_dir, 'nltk_data'))\n",
    "\n",
    "import numpy as np\n",
    "from nltk import data\n",
    "from langchain.vectorstores import FAISS\n",
    "from langchain.embeddings.huggingface import HuggingFaceEmbeddings\n",
    "from langchain.text_splitter import CharacterTextSplitter,MarkdownTextSplitter\n",
    "from langchain.document_loaders import UnstructuredFileLoader,UnstructuredMarkdownLoader"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "6cd3b3e0-4795-4573-a188-1d8acb9c16ae",
   "metadata": {},
   "outputs": [],
   "source": [
    "embedding_model = 'text2vec-large-chinese'\n",
    "\n",
    "#基于余弦相似性公式计算两个向量之间的相似度\n",
    "def get_cos_similar(v1: list, v2: list):\n",
    "    num = float(np.dot(v1, v2))  # 向量点乘\n",
    "    denom = np.linalg.norm(v1) * np.linalg.norm(v2)  # 求模长的乘积\n",
    "    return 0.5 + 0.5 * (num / denom) if denom != 0 else 0\n",
    "\n",
    "#加载text2vec-large-chinese模型\n",
    "def load_embeddings():\n",
    "    embedding_model_path = os.path.join(work_dir, embedding_model)\n",
    "    embeddings = HuggingFaceEmbeddings(model_name=embedding_model_path)\n",
    "    return embeddings\n",
    "\n",
    "#计算两段文字的相似度\n",
    "def get_embedding_sim(s1, s2, embeddings):\n",
    "    embedding1 = embeddings.embed_query(s1)#将文本转为向量\n",
    "    print('embedding1: ', len(embedding1))\n",
    "    embedding2 = embeddings.embed_query(s2)\n",
    "    sim = get_cos_similar(embedding1, embedding2)\n",
    "    print('sim of \\'{0}\\' and \\'{1}\\' is : {2}'.format(s1, s2, sim))\n",
    "    return sim"
   ]
  },
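  {
   "cell_type": "markdown",
   "id": "3f6d2a1e-7c4b-4e9a-9b1d-2a5c8e0f4d21",
   "metadata": {},
   "source": [
    "Note that `get_cos_similar` rescales the raw cosine similarity from [-1, 1] to [0, 1] via `0.5 + 0.5 * cos`. A minimal standalone sketch (plain NumPy, no embedding model needed, with made-up 2-d vectors) showing the three landmark cases:\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "def get_cos_similar(v1, v2):\n",
    "    num = float(np.dot(v1, v2))\n",
    "    denom = np.linalg.norm(v1) * np.linalg.norm(v2)\n",
    "    return 0.5 + 0.5 * (num / denom) if denom != 0 else 0\n",
    "\n",
    "print(get_cos_similar([1, 0], [1, 0]))   # identical direction -> 1.0\n",
    "print(get_cos_similar([1, 0], [0, 1]))   # orthogonal -> 0.5\n",
    "print(get_cos_similar([1, 0], [-1, 0]))  # opposite direction -> 0.0\n",
    "```"
   ]
  },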
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "2b533d4c-fbe3-4224-97fc-d36fed4478e8",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
      "\n",
      "  from .autonotebook import tqdm as notebook_tqdm\n",
      "\n",
      "No sentence-transformers model found with name /home/ma-user/work/text2vec-large-chinese. Creating a new one with MEAN pooling.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "embedding1:  1024\n",
      "\n",
      "sim of '我今天心情很差' and '我今天很不开心' is : 0.8813112714604454\n",
      "\n",
      "embedding1:  1024\n",
      "\n",
      "sim of '我今天心情很差' and 'what are you弄啥嘞' is : 0.5937208396102376\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "0.5937208396102376"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sentence1 = \"我今天心情很差\"\n",
    "sentence2 = \"我今天很不开心\"\n",
    "sentence3 = \"what are you弄啥嘞\"\n",
    "embeddings = load_embeddings()\n",
    "get_embedding_sim(sentence1, sentence2, embeddings)\n",
    "get_embedding_sim(sentence1, sentence3, embeddings)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b2760277-0f61-4979-beef-44d7d7041311",
   "metadata": {},
   "source": [
    "## 4. 文档查询\n",
    "对于文档查询，我们首先也是将分割后的文档转成嵌入向量，然后存储到向量数据库，再根据查询条件，从向量数据库进行搜索。\n",
    "\n",
    "langchain支持的向量数据库有很多种，例如：[FAISS](https://github.com/facebookresearch/faiss)、[Milvus](https://github.com/milvus-io/milvus)、[PGVector](https://github.com/pgvector/pgvector)等，本文使用的是FAISS。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "931ca856-6c99-4ace-935e-3d80f6264f55",
   "metadata": {},
   "outputs": [],
   "source": [
    "#加载txt文件\n",
    "def load_txt_file(txt_file):    \n",
    "    loader = UnstructuredFileLoader(os.path.join(work_dir, txt_file))\n",
    "    docs = loader.load()\n",
    "    return docs\n",
    "\n",
    "#加载md文件\n",
    "def load_md_file(md_file):    \n",
    "    loader = UnstructuredMarkdownLoader(os.path.join(work_dir, md_file))\n",
    "    docs = loader.load()\n",
    "    return docs\n",
    "\n",
    "#分割txt文件\n",
    "def load_txt_splitter(txt_file, chunk_size=100, chunk_overlap=20):\n",
    "    docs = load_txt_file(txt_file)\n",
    "    text_splitter = CharacterTextSplitter(chunk_size=chunk_size, chunk_overlap=chunk_overlap)\n",
    "    split_docs = text_splitter.split_documents(docs)\n",
    "    return split_docs\n",
    "\n",
    "#分割md文件\n",
    "def load_md_splitter(md_file, chunk_size=100, chunk_overlap=20):\n",
    "    docs = load_md_file(md_file)\n",
    "    text_splitter = MarkdownTextSplitter(chunk_size=chunk_size, chunk_overlap=chunk_overlap)\n",
    "    split_docs = text_splitter.split_documents(docs)\n",
    "    return split_docs\n",
    "\n",
    "#分割docs_path目录下的文件，并将其转为向量，放到FAISS向量数据库中\n",
    "def load_vector_store(docs_path):\n",
    "    split_docs = []\n",
    "    for doc in os.listdir(docs_path):\n",
    "        doc_path = f'{docs_path}/{doc}'\n",
    "        if doc_path.endswith('.txt'):\n",
    "            docs = load_txt_splitter(doc_path)\n",
    "            split_docs.extend(docs)\n",
    "        elif doc_path.endswith('.md'):\n",
    "            docs = load_md_splitter(doc_path)\n",
    "            split_docs.extend(docs)\n",
    "        else:\n",
    "            print('不支持的文件类型:', doc_path)\n",
    "            continue\n",
    "    embeddings = load_embeddings()\n",
    "    vector_store = FAISS.from_documents(split_docs, embeddings)\n",
    "    return vector_store\n",
    "\n",
    "#从向量数据集进行内容查询\n",
    "def sim_search(query, vector_store):\n",
    "    #similarity_search_with_score返回相似的文档内容和查询与文档的距离分数\n",
    "    #返回的距离分数是L2距离。因此，得分越低越好。\n",
    "    re = vector_store.similarity_search_with_score(query)\n",
    "    print('query result: ', re)"
   ]
  },
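  {
   "cell_type": "markdown",
   "id": "8c1b5e2d-4a97-4f30-b6aa-5d1e9c7f3b02",
   "metadata": {},
   "source": [
    "Since `similarity_search_with_score` returns L2 distances here, lower scores mean closer matches, the opposite of the cosine similarity in section 3. A small NumPy sketch (with made-up 2-d vectors standing in for real embeddings) of how a flat L2 index ranks documents:\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "query_vec = np.array([1.0, 0.0])\n",
    "doc_vecs = {\"close\": np.array([0.9, 0.1]), \"far\": np.array([-1.0, 0.5])}\n",
    "\n",
    "# squared L2 distance, as used by FAISS's IndexFlatL2: lower = more similar\n",
    "scores = {name: float(np.sum((query_vec - v) ** 2)) for name, v in doc_vecs.items()}\n",
    "print(sorted(scores, key=scores.get))  # ['close', 'far']\n",
    "```"
   ]
  },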
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "e5e3ddf3-f704-433b-9c39-e8dab617fd2f",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Created a chunk of size 146, which is longer than the specified 100\n",
      "\n",
      "No sentence-transformers model found with name /home/ma-user/work/text2vec-large-chinese. Creating a new one with MEAN pooling.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "不支持的文件类型: /home/ma-user/work/docs/.ipynb_checkpoints\n",
      "\n",
      "query result:  [(Document(page_content='ModelBox支持两种方式运行，一种是服务化，一种是SDK，开发者可以按照下表选择相关的开发模式。', metadata={'source': '/home/ma-user/work/docs/modelbox.txt'}), 420.72437), (Document(page_content='2. SDK：ModelBox提供了ModelBox开发库，使用于扩展现有应用支持高性能AI推理，专注AI推理业务，支持c++，Python集成', metadata={'source': '/home/ma-user/work/docs/modelbox.txt'}), 587.23193), (Document(page_content='如果是第一次创建工程，在ModelBox', metadata={'source': '/home/ma-user/work/docs/第一个应用.md'}), 598.4208), (Document(page_content='也就是说，ModelBox的Pipeline模式，首先需要将应用的流程图构建出来，再分别实现图中的每个模块（ModelBox中称为“功能单元”），对于上面的视频应用：读取摄像头并输出原始画面，对应的', metadata={'source': '/home/ma-user/work/docs/第一个应用.md'}), 646.28265)]\n"
     ]
    }
   ],
   "source": [
    "query = \"ModelBox支持哪两种方式\"\n",
    "vector_store = load_vector_store(docs_path)\n",
    "sim_search(query, vector_store)"
   ]
  }
 ],
 "metadata": {
  "AIGalleryInfo": {
   "item_id": "0844abfc-df12-4b3c-a02b-08694652b441"
  },
  "flavorInfo": {
   "architecture": "X86_64",
   "category": "GPU"
  },
  "imageInfo": {
   "id": "d996b661-e127-48c4-a90a-fca29535f201",
   "name": "pytorch1.10-cuda10.2-cudnn7-ubuntu18.04"
  },
  "kernelspec": {
   "display_name": "python-3.7.10",
   "language": "python",
   "name": "python-3.7.10"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
